This report documents unsupervised learning analyses of the model residuals for traits measured in white and black REGARDS subjects (documented in the ‘baseline.csv’ file downloaded from Suraju Sadeeq’s OneDrive). The following outcomes will be considered for inclusion as traits:
eGFR: ‘EGFR_CKDEPI’, left ventricular hypertrophy: ‘lvh_main’, atrial fibrillation: ‘Afib_SR_ECG’, diabetes: ‘Diab_SRMed_glu’, lipidemia: ‘Lipidemia_meds_labs’, myocardial infarction: ‘MI_SR_ECG’, CAD: ‘CAD_SR_ECG’, insulin, creatinine level: ‘Creatinine_urine’, albumin level: ‘Albumin_urine’, cystatin C level: ‘Cysc’, C-reactive protein: ‘Crp’, triglycerides: ‘Trigly’, glucose, LDL, HDL, DBP, SBP, cholesterol: ‘Cholest’, heart rate: ‘Heartrate’, stroke: ‘Stroke_SR’, depression (as scale): ‘CESD’, hypertension diagnosis: ‘Hyper_SRmeds_BP’, TIA: ‘TIA_SR’, peripheral artery disease surgery: ‘PAD_surgery’, peripheral artery disease amputation: ‘PAD_amputation’, kidney failure: ‘KidneyFailure_SR’, waking at night due to breathing difficulty: ‘HF_WakeNight’, binary report of falls in the past year: ‘Falls’, self-reported DVT: ‘DVT_SR’, self-reported dialysis: ‘Dialysis_SR’, SF-12 mental: ‘MCS’, SF-12 physical: ‘PCS’, perceived stress scale: ‘PSS’, cancer diagnosis: ‘Cancer’.
Some variables are excluded based on inspection. CESD is highly skewed and does not appear to conform to the Center for Epidemiologic Studies Depression (CES-D) scale, so it will be dropped. No subject has CESD > 12, whereas the CES-D scale ranges from 0 to 60 and individuals are considered at risk for depression at scores of 16 or higher.
The table below reports the missing rate for each outcome. Variables with missingness > 5% will be dropped: insulin, TIA, and cancer. MCS and PCS sit at the threshold (0.050 after rounding) and are retained.
| Outcome | MissingProp |
|---|---|
| EGFR_CKDEPI | 0.025 |
| lvh_main | 0.015 |
| Diab_SRMed_glu | 0.025 |
| Lipidemia_meds_labs | 0.027 |
| Hyper_SRmeds_BP | 0.002 |
| Afib_SR_ECG | 0.024 |
| MI_SR_ECG | 0.018 |
| CAD_SR_ECG | 0.019 |
| insulin | 0.268 |
| Creatinine_urine | 0.047 |
| Albumin_urine | 0.049 |
| DBP | 0.003 |
| SBP | 0.003 |
| Cysc | 0.043 |
| Crp | 0.042 |
| Trigly | 0.026 |
| Glucose | 0.025 |
| Ldl | 0.041 |
| Hdl | 0.032 |
| Cholest | 0.025 |
| Heartrate | 0.012 |
| TIA_SR | 0.068 |
| Stroke_SR | 0.003 |
| PAD_surgery | 0.002 |
| PAD_amputation | 0.000 |
| KidneyFailure_SR | 0.006 |
| HF_WakeNight | 0.003 |
| Falls | 0.003 |
| DVT_SR | 0.003 |
| Dialysis_SR | 0.006 |
| MCS | 0.050 |
| PCS | 0.050 |
| PSS | 0.000 |
| Cancer | 0.370 |
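The missingness screen above can be sketched as follows. This is a minimal illustration in Python rather than the code that produced the table; the function names and the toy rows are hypothetical, and missing values are assumed to be encoded as `None`.

```python
def missing_proportion(rows, column):
    """Fraction of rows whose value for `column` is missing (None)."""
    return sum(1 for r in rows if r.get(column) is None) / len(rows)


def split_by_missingness(rows, columns, threshold=0.05):
    """Separate columns into (kept, dropped) using a missingness threshold."""
    kept, dropped = [], []
    for c in columns:
        if missing_proportion(rows, c) > threshold:
            dropped.append(c)
        else:
            kept.append(c)
    return kept, dropped


# Toy data (hypothetical values, not REGARDS records).
rows = [
    {"SBP": 120, "insulin": None},
    {"SBP": 135, "insulin": 8.1},
    {"SBP": 118, "insulin": None},
    {"SBP": 142, "insulin": 6.4},
]
kept, dropped = split_by_missingness(rows, ["SBP", "insulin"], threshold=0.05)
print(kept, dropped)
```

The comparison is strict (`>`), which is consistent with outcomes sitting exactly at the threshold being retained.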
The following variables will be considered as covariates in the analysis: alcohol use ‘Alc_Use’, gender ‘Gender_x’, age ‘Age_x’, smoking ‘Smoke’, education ‘ED_Cat’, income ‘Income’, weight ‘Weight’, and race ‘Race_x’.
The table below reports the missing rate for each covariate. Income is the only covariate with more than 10% missingness; it will be dropped and the other seven retained. A complete-case analysis will be performed on all individuals with no missing values in the retained covariates and outcomes.
| Covariate | MissingProp |
|---|---|
| Weight | 0.000 |
| Smoke | 0.004 |
| Alc_Use | 0.000 |
| ED_Cat | 0.001 |
| Income | 0.123 |
| Gender_x | 0.000 |
| Race_x | 0.000 |
| Age_x | 0.000 |
Polygenic risk scores (PRSs) will be included as covariates for traits where they are available. PRSs are available in white and black subjects for the following traits: eGFR (PGS000303), CAD (PGS000011), albumin (PGS000669), C-reactive protein (PGS000314), triglycerides (PGS000066), LDL (PGS000061), DBP (PGS000302), SBP (PGS000301), total cholesterol (TC; PGS000062), and heart rate (PGS000300).
Note: the PGS for glucose (PGS000684) is present for white subjects but not black subjects.
The complete case analysis consists of 8559 subjects.
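As a rough illustration of the complete-case restriction, the sketch below keeps only the records with no missing value in any retained column. It is a minimal sketch in Python under the assumption that missing values are stored as `None`; the helper name and the toy records are hypothetical.

```python
def complete_cases(rows, columns):
    """Return the rows that have a non-missing value for every listed column."""
    return [r for r in rows if all(r.get(c) is not None for c in columns)]


# Toy records (hypothetical values); column names follow the report.
rows = [
    {"Age_x": 64, "Smoke": "Never", "SBP": 128},
    {"Age_x": 71, "Smoke": None, "SBP": 141},
    {"Age_x": 58, "Smoke": "Past", "SBP": None},
    {"Age_x": 66, "Smoke": "Current", "SBP": 119},
]
analysis_set = complete_cases(rows, ["Age_x", "Smoke", "SBP"])
print(len(analysis_set))
```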
The following outcomes are binary, so logistic regression will be used: LVH, diabetes, lipidemia, AFib, MI, CAD, stroke, hypertension, PAD surgery, PAD amputation, kidney failure, nighttime waking, reported falls, DVT, and dialysis. The table below gives the frequency of the less common category for each binary outcome. Outcomes for which this frequency is below 2% will be dropped: PAD surgery, PAD amputation, and dialysis.
| Outcome | RareProp |
|---|---|
| lvh_main | 0.133 |
| Diab_SRMed_glu | 0.265 |
| Lipidemia_meds_labs | 0.449 |
| Hyper_SRmeds_BP | 0.315 |
| Afib_SR_ECG | 0.075 |
| MI_SR_ECG | 0.105 |
| CAD_SR_ECG | 0.142 |
| Stroke_SR | 0.057 |
| PAD_surgery | 0.018 |
| PAD_amputation | 0.003 |
| KidneyFailure_SR | 0.023 |
| HF_WakeNight | 0.091 |
| Falls | 0.146 |
| DVT_SR | 0.050 |
| Dialysis_SR | 0.004 |
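The rare-category screen can be sketched as below. This is an illustrative Python version, not the report's own code; `rare_proportion` is a hypothetical helper and the two toy outcomes are made up.

```python
from collections import Counter


def rare_proportion(values):
    """Proportion of observations in the less common of two categories."""
    counts = Counter(values)
    if len(counts) != 2:
        raise ValueError("expected exactly two categories")
    return min(counts.values()) / len(values)


# Toy binary outcomes (hypothetical): 3% and 1% of subjects are cases.
outcomes = {
    "common_enough": [0] * 97 + [1] * 3,
    "too_rare": [0] * 99 + [1] * 1,
}
retained = [name for name, v in outcomes.items() if rare_proportion(v) >= 0.02]
print(retained)
```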
Heatmap of correlation between outcome variables.
## Correlation computed with
## • Method: 'pearson'
## • Missing treated using: 'pairwise.complete.obs'
The final list of the 28 outcome variables that pass QC and are used in modeling:
## [1] "eGFR" "LVH" "AFib" "Diabetes"
## [5] "Lipidemia" "MI" "CAD" "Creatinine"
## [9] "Albumin" "CysC" "CRP" "TG"
## [13] "Glucose" "LDL" "HDL" "DBP"
## [17] "SBP" "TC" "HR" "Stroke"
## [21] "Hypertension" "KidneyFailure" "Waking" "Falls"
## [25] "DVT" "MCS" "PCS" "PSS"
The coefficient of determination (R²) is calculated for each linear model to give a sense of how predictive each covariate is; for the logistic models, AUROC is used instead. The plots are interpreted as follows: each labeled point shows the predictive accuracy of the model fitted without the labeled covariate.
## Setting levels: control = 0, case = 1
## Setting direction: controls < cases
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
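The leave-one-covariate-out comparison described above can be sketched for a continuous outcome as follows. This is a minimal pure-Python illustration, not the code behind the plots: the ordinary least squares solver, the function names, and the toy data are all assumptions made for the example, and results are keyed by the covariate that was held out.

```python
def fit_ols(X, y):
    """Least-squares coefficients for y ~ X (X already has an intercept column)."""
    p = len(X[0])
    # Augmented normal equations: [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(p)]
         + [sum(row[i] * yi for row, yi in zip(X, y))] for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][p] / A[i][i] for i in range(p)]


def r_squared(X, y):
    """Coefficient of determination of the least-squares fit of y on X."""
    beta = fit_ols(X, y)
    fitted = [sum(b * x for b, x in zip(beta, row)) for row in X]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


def leave_one_out_r2(covariates, y):
    """R^2 of the full model and of each model with one covariate held out."""
    names = list(covariates)

    def design(use):
        return [[1.0] + [covariates[n][i] for n in use] for i in range(len(y))]

    results = {"full": r_squared(design(names), y)}
    for held_out in names:
        kept = [n for n in names if n != held_out]
        results[held_out] = r_squared(design(kept), y)
    return results


# Toy data (hypothetical): the outcome depends strongly on age, weakly on weight.
covariates = {
    "age": [50, 60, 70, 80, 55, 65, 75, 85],
    "weight": [80, 70, 90, 60, 75, 85, 65, 95],
}
noise = [0.5, -0.5, 0.3, -0.3, 0.2, -0.2, 0.1, -0.1]
y = [2 * a + 0.1 * w + e
     for a, w, e in zip(covariates["age"], covariates["weight"], noise)]
r2 = leave_one_out_r2(covariates, y)
print(r2)
```

In this toy example, holding out `age` lowers R² far more than holding out `weight`, which is the kind of contrast the plots are meant to expose.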
Assess clustering and PCs for full set of residuals.
## Warning: did not converge in 10 iterations
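The warning above indicates that a k-means run stopped at its iteration cap before the cluster assignments stabilised; R's `kmeans` uses `iter.max = 10` by default. The sketch below shows the idea with Lloyd's algorithm in plain Python, which is a simpler variant than the Hartigan-Wong algorithm R uses by default. The function, its `max_iter` argument, and the toy points are hypothetical.

```python
def kmeans(points, centers, max_iter=10):
    """Lloyd's algorithm. Returns (labels, centers, converged)."""
    centers = [list(c) for c in centers]
    labels = None
    for _ in range(max_iter):
        # Assignment step: nearest center by squared Euclidean distance.
        new_labels = [
            min(range(len(centers)),
                key=lambda k: sum((p - c) ** 2 for p, c in zip(pt, centers[k])))
            for pt in points
        ]
        if new_labels == labels:
            return labels, centers, True  # assignments stopped changing
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for k in range(len(centers)):
            members = [pt for pt, lab in zip(points, labels) if lab == k]
            if members:
                centers[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers, False  # hit the iteration cap


# Toy data (hypothetical): two well-separated groups.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centers, converged = kmeans(points, [(0, 0), (10, 10)], max_iter=10)
_, _, converged_capped = kmeans(points, [(0, 0), (10, 10)], max_iter=1)
print(labels, converged, converged_capped)
```

A run that reports non-convergence still returns a partition, but the result may change if more iterations are allowed, so it is worth checking whether a larger cap alters the clusters.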
Assess clustering and PCs for model residuals with each of the seven covariates held out.
## Warning: did not converge in 10 iterations
## [1] "Adjusted rand index, no Alcohol: 0.023"
## [1] "No Alcohol table of clustering results"
| Cluster | Current | Never | Past |
|---|---|---|---|
| 1 | 2521 | 1554 | 974 |
| 2 | 974 | 913 | 665 |
| 3 | 378 | 317 | 263 |
## Warning: Quick-TRANSfer stage steps exceeded maximum (= 427950)
## Warning: did not converge in 10 iterations
## [1] "Adjusted rand index, no Gender: 0.011"
## [1] "No Gender table of clustering results"
| Cluster | F | M |
|---|---|---|
| 1 | 4299 | 3067 |
| 2 | 585 | 608 |
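The adjusted Rand index can be computed directly from a cluster-by-category contingency table. The sketch below is a minimal Python version of the standard formula, assuming the index compares cluster assignments with the categories of the held-out covariate; the function name is hypothetical. Applied to the gender table above, it gives the reported value of 0.011.

```python
from math import comb


def adjusted_rand_index(table):
    """Adjusted Rand index from a contingency table (list of rows of counts)."""
    n = sum(sum(row) for row in table)
    sum_cells = sum(comb(v, 2) for row in table for v in row)
    sum_rows = sum(comb(sum(row), 2) for row in table)
    sum_cols = sum(comb(sum(col), 2) for col in zip(*table))
    expected = sum_rows * sum_cols / comb(n, 2)   # agreement expected by chance
    max_index = (sum_rows + sum_cols) / 2         # largest attainable agreement
    return (sum_cells - expected) / (max_index - expected)


# Rows are clusters 1 and 2; columns are F and M, copied from the table above.
gender_table = [[4299, 3067], [585, 608]]
ari = adjusted_rand_index(gender_table)
print(round(ari, 3))  # 0.011
```

Values near 0 mean the clusters agree with the covariate no better than chance, and a value of 1 means the two partitions are identical.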
## Warning: Quick-TRANSfer stage steps exceeded maximum (= 427950)
## Warning: did not converge in 10 iterations
## [1] "Adjusted rand index, no Smoking: 0.01"
## [1] "No Smoking table of clustering results"
| Cluster | Current | Never | Past |
|---|---|---|---|
| 1 | 796 | 2396 | 1846 |
| 2 | 480 | 1092 | 986 |
| 3 | 193 | 370 | 400 |
## Warning: Quick-TRANSfer stage steps exceeded maximum (= 427950)
## Warning: did not converge in 10 iterations
## [1] "Adjusted rand index, no Education: 0.016"
## [1] "No Education table of clustering results"
| Cluster | College graduate and above | High school graduate | Less than high school | Some college |
|---|---|---|---|---|
| 1 | 625 | 768 | 564 | 749 |
| 2 | 49 | 67 | 49 | 49 |
| 3 | 1611 | 1219 | 539 | 1341 |
| 4 | 264 | 241 | 173 | 251 |
## Warning: Quick-TRANSfer stage steps exceeded maximum (= 427950)
## Warning: did not converge in 10 iterations
## [1] "Adjusted rand index, no Race: 0.034"
## [1] "No Race table of clustering results"
| Cluster | B | W |
|---|---|---|
| 1 | 6189 | 1164 |
| 2 | 949 | 257 |
More in-depth investigation of gender.
## Setting levels: control = F, case = M
## Setting direction: controls < cases
## [1] "AUC for PC1 predicting gender: 0.53"
## [1] "AUC for PC2 predicting gender: 0.54"
## [1] "AUC for PC3 predicting gender: 0.58"
## [1] "AUC for PC4 predicting gender: 0.61"
## Setting levels: control = F, case = M
## Setting direction: controls > cases
## [1] "AUC for PC5 predicting gender: 0.56"
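The AUC values above can be read as the probability that a randomly chosen case (here M) has a higher score than a randomly chosen control (here F). The sketch below computes that quantity by direct pairwise comparison. It is an illustrative Python version with a hypothetical function name and toy scores, not the `pROC` implementation.

```python
def auroc(scores, labels):
    """P(case score > control score), counting ties as 1/2. Labels: 1 = case."""
    cases = [s for s, lab in zip(scores, labels) if lab == 1]
    controls = [s for s, lab in zip(scores, labels) if lab == 0]
    wins = sum((c > d) + 0.5 * (c == d) for c in cases for d in controls)
    return wins / (len(cases) * len(controls))


# Toy scores (hypothetical), e.g. one principal component per subject.
scores = [0.1, 0.4, 0.35, 0.8]
labels = [0, 0, 1, 1]
auc = auroc(scores, labels)

# Reversing the sign of the score mirrors the AUC around 0.5.
auc_flipped = auroc([-s for s in scores], labels)
print(auc, auc_flipped)  # 0.75 0.25
```

The message "controls > cases" shown for PC5 indicates that the direction was reversed for that component, so lower PC5 values are associated with the case level. With an automatically chosen direction the reported AUC will generally not fall below 0.5, which means values such as 0.53 to 0.61 indicate weak separation regardless of sign.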